Three-Body and One-Body Channels of the Auger Core-Valence-Valence decay: Simplified Approach
We propose a computationally simple model of Auger and APECS line shapes from
open-band solids. Part of the intensity comes from the decay of unscreened
core-holes and is obtained from the two-body Green's function,
as in the case of filled bands. The rest of the intensity arises from screened
core-holes and is derived using a variational description of the relaxed ground
state; this involves the two-hole-one-electron propagator, which
also contains one-hole contributions. For many transition metals, the two-hole
Green's function can be well described by the Ladder
Approximation, but the three-body Green's function poses serious further
problems. To calculate this three-body propagator, treating electrons and
holes on an equal footing, we propose a practical approach to sum the series
to all orders. We
achieve that by formally rewriting the problem in terms of a fictitious
three-body interaction. Our method grants non-negative densities of states,
explains the apparent negative-U behavior of the spectra of early transition
metals and interpolates well between weak and strong coupling, as we
demonstrate by test model calculations.
Comment: AMS-LaTeX file, 23 pages, 8 EPS and 3 PS figures embedded in the text
with epsfig.sty and float.sty; submitted to Phys. Rev.
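For orientation, the Ladder Approximation mentioned above admits a well-known closed form for the two-hole Green's function of a filled band with on-site interaction U (the Cini-Sawatzky result; U and the non-interacting propagator G_0 are standard notation assumed here, not taken from the abstract):

```latex
G^{(2)}(\omega) \;=\; \frac{G^{(2)}_{0}(\omega)}{1 - U\,G^{(2)}_{0}(\omega)}
```

A pole of this expression outside the two-hole band continuum signals a split-off resonance, which governs the crossover between band-like and atomic-like Auger line shapes; it is the generalization of this resummation to the three-body (two-hole-one-electron) propagator that the paper addresses.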
Sparse Graph Learning from Spatiotemporal Time Series
Outstanding achievements of graph neural networks for spatiotemporal time
series analysis show that relational constraints introduce an effective
inductive bias into neural forecasting architectures. Often, however, the
relational information characterizing the underlying data-generating process is
unavailable and the practitioner is left with the problem of inferring from
data which relational graph to use in the subsequent processing stages. We
propose novel, principled - yet practical - probabilistic score-based methods
that learn the relational dependencies as distributions over graphs while
maximizing end-to-end performance on the task at hand. The proposed graph learning
framework is based on consolidated variance reduction techniques for Monte
Carlo score-based gradient estimation, is theoretically grounded, and, as we
show, effective in practice. In this paper, we focus on the time series
forecasting problem and show that, by tailoring the gradient estimators to the
graph learning problem, we are able to achieve state-of-the-art performance
while controlling the sparsity of the learned graph and the computational
scalability. We empirically assess the effectiveness of the proposed method on
synthetic and real-world benchmarks, showing that the proposed solution can be
used as a stand-alone graph identification procedure as well as a graph
learning component of an end-to-end forecasting architecture.
Comment: updated and extended version
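The score-based gradient estimation described above can be illustrated with a toy sketch: edges are sampled from independent Bernoulli distributions, and the gradient of the expected task loss w.r.t. the edge probabilities is estimated with the score-function (REINFORCE) estimator plus a mean-loss baseline for variance reduction. The target graph and loss below are hypothetical stand-ins, not the paper's forecasting setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes = 4
target = np.eye(n_nodes, k=1)             # hypothetical "true" sparse graph
theta = np.full((n_nodes, n_nodes), 0.5)  # Bernoulli edge probabilities

def task_loss(adj):
    # stand-in for a downstream forecasting loss
    return float(((adj - target) ** 2).mean())

def score_gradient(theta, n_samples=128):
    """Score-function (REINFORCE) gradient of E_{A~Bern(theta)}[loss(A)]
    with a mean-loss baseline as a variance-reduction control variate."""
    samples = [(rng.random(theta.shape) < theta).astype(float)
               for _ in range(n_samples)]
    losses = np.array([task_loss(a) for a in samples])
    baseline = losses.mean()
    grad = np.zeros_like(theta)
    for adj, loss in zip(samples, losses):
        # d/dtheta of log Bern(adj; theta)
        score = adj / theta - (1.0 - adj) / (1.0 - theta)
        grad += (loss - baseline) * score
    return grad / n_samples

for _ in range(150):
    theta = np.clip(theta - 0.5 * score_gradient(theta), 0.01, 0.99)
```

After training, the edge probabilities concentrate on the target edges, showing how a distribution over graphs can be learned purely from a task loss.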
Learning to Reconstruct Missing Data from Spatiotemporal Graphs with Sparse Observations
Modeling multivariate time series as temporal signals over a (possibly
dynamic) graph is an effective representational framework that allows for
developing models for time series analysis. In fact, discrete sequences of
graphs can be processed by autoregressive graph neural networks to recursively
learn representations at each discrete point in time and space. Spatiotemporal
graphs are often highly sparse, with time series characterized by multiple,
concurrent, and long sequences of missing data, e.g., due to the unreliable
underlying sensor network. In this context, autoregressive models can be
brittle and exhibit unstable learning dynamics. The objective of this paper is,
then, to tackle the problem of learning effective models to reconstruct, i.e.,
impute, missing data points by conditioning the reconstruction only on the
available observations. In particular, we propose a novel class of
attention-based architectures that, given a set of highly sparse discrete
observations, learn a representation for points in time and space by exploiting
a spatiotemporal propagation architecture aligned with the imputation task.
Representations are trained end-to-end to reconstruct observations w.r.t. the
corresponding sensor and its neighboring nodes. Compared to the state of the
art, our model handles sparse data without propagating prediction errors or
requiring a bidirectional model to encode forward and backward time
dependencies. Empirical results on representative benchmarks show the
effectiveness of the proposed method.
Comment: Accepted at NeurIPS 202
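The attention-based propagation described above can be sketched in miniature: a missing node's representation is reconstructed as an attention-weighted combination of its neighbors' hidden states. The states, the query construction, and the (untrained) weights below are all illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hidden states for 4 sensors; sensor 1 is unobserved and must be
# reconstructed from its neighbors' representations.
h = rng.normal(size=(4, 8))
neighbors_of_1 = [0, 2, 3]

def attention_readout(query, keys):
    """Scaled dot-product attention over neighbor states: a minimal,
    untrained stand-in for learned spatiotemporal attention."""
    scores = keys @ query / np.sqrt(len(query))
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()               # softmax over neighbors
    return weights @ keys, weights

query = h[neighbors_of_1].mean(axis=0)     # crude query for the missing node
h1_hat, att = attention_readout(query, h[neighbors_of_1])
```

Because the readout conditions only on observed neighbors, no prediction error is propagated through the missing slot, mirroring the paper's motivation.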
Graph-based Time Series Clustering for End-to-End Hierarchical Forecasting
Existing relationships among time series can be exploited as inductive biases
in learning effective forecasting models. In hierarchical time series,
relationships among subsets of sequences induce hard constraints (hierarchical
inductive biases) on the predicted values. In this paper, we propose a
graph-based methodology to unify relational and hierarchical inductive biases
in the context of deep learning for time series forecasting. In particular, we
model both types of relationships as dependencies in a pyramidal graph
structure, with each pyramidal layer corresponding to a level of the hierarchy.
By exploiting modern - trainable - graph pooling operators, we show that the
hierarchical structure, if not available as a prior, can be learned directly
from data, thus obtaining cluster assignments aligned with the forecasting
objective. A differentiable reconciliation stage is incorporated into the
processing architecture, allowing hierarchical constraints to act both as an
architectural bias as well as a regularization element for predictions.
Simulation results on representative datasets show that the proposed method
compares favorably against the state of the art.
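The reconciliation stage mentioned above enforces the hierarchical hard constraints on predictions. A minimal, differentiable example is the least-squares (OLS) reconciliation used in the hierarchical-forecasting literature: incoherent base forecasts are orthogonally projected onto the subspace of forecasts that respect the aggregation structure. The tiny two-series hierarchy below is an assumption for illustration.

```python
import numpy as np

# Aggregation matrix S maps 2 bottom-level series to [total, bottom1, bottom2].
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Incoherent base forecasts for [total, bottom1, bottom2]: 10 != 4 + 5.
y_hat = np.array([10.0, 4.0, 5.0])

# OLS reconciliation: orthogonal projection onto the coherent subspace span(S).
P = S @ np.linalg.inv(S.T @ S) @ S.T
y_rec = P @ y_hat
```

The projection is a fixed linear map, so gradients flow through it, which is what allows the constraints to act as both an architectural bias and a regularizer when placed inside an end-to-end model.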
Filling the G_ap_s: Multivariate Time Series Imputation by Graph Neural Networks
Dealing with missing values and incomplete time series is a labor-intensive,
tedious, inevitable task when handling data coming from real-world
applications. Effective spatio-temporal representations would allow imputation
methods to reconstruct missing temporal data by exploiting information coming
from sensors at different locations. However, standard methods fall short in
capturing the nonlinear time and space dependencies existing within networks of
interconnected sensors and do not take full advantage of the available - and
often strong - relational information. Notably, most state-of-the-art
imputation methods based on deep learning do not explicitly model relational
aspects and, in any case, do not exploit processing frameworks able to
adequately represent structured spatio-temporal data. Conversely, graph neural
networks have recently surged in popularity as both expressive and scalable
tools for processing sequential data with relational inductive biases. In this
work, we present the first assessment of graph neural networks in the context
of multivariate time series imputation. In particular, we introduce a novel
graph neural network architecture, named GRIN, which aims at reconstructing
missing data in the different channels of a multivariate time series by
learning spatio-temporal representations through message passing. Empirical
results show that our model outperforms state-of-the-art methods in the
imputation task on relevant real-world benchmarks with mean absolute error
improvements often higher than 20%.
Comment: Accepted at ICLR 202
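The core idea of reconstructing missing readings via message passing can be shown with a deliberately simple diffusion-style baseline: missing entries are iteratively replaced by the mean of their graph neighbors while observed values stay fixed. This is far simpler than GRIN's learned spatio-temporal message passing; the line-graph sensor network below is an illustrative assumption.

```python
import numpy as np

# Toy sensor network: 4 sensors on a line graph, one reading per sensor.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

x = np.array([1.0, np.nan, 3.0, 4.0])     # sensor 1 is missing
mask = ~np.isnan(x)

def impute_by_message_passing(x, mask, adj, n_iters=50):
    """Fill missing entries with the mean of neighboring values,
    iterating so information diffuses across the graph."""
    x = np.where(mask, x, 0.0)
    for _ in range(n_iters):
        est = (adj @ x) / np.maximum(adj.sum(axis=1), 1.0)
        x = np.where(mask, x, est)        # keep observed values fixed
    return x

x_imp = impute_by_message_passing(x, mask, adj)
```

Here the missing reading converges to the average of its two neighbors, the kind of relational information standard non-graph imputation methods leave unused.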
Underactuated Attitude Control with Deep Reinforcement Learning
Autonomy is a key challenge for future space exploration endeavors. Deep Reinforcement Learning holds the promise of developing agents able to learn complex behaviors simply by interacting with their environment. This work investigates the use of Reinforcement Learning for satellite attitude control in two working conditions: the nominal case, in which all the actuators (a set of 3 reaction wheels) are working properly, and the underactuated case, where an actuator failure is simulated randomly along one of the axes. In particular, a control policy is implemented and evaluated to maneuver a small satellite from a random starting angle to a given pointing target. In the proposed approach, the control policies are implemented as neural networks trained with a custom version of the Proximal Policy Optimization algorithm, and they allow the designer to specify the desired control properties simply by shaping the reward function. The agents learn to effectively perform large-angle slew maneuvers with fast convergence and industry-standard pointing accuracy.
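The reward-shaping idea above can be made concrete with a hypothetical shaped reward: penalizing pointing error drives the slew, while a smaller penalty on residual angular velocity encourages a stable, non-spinning arrival. The weights and functional form are illustrative assumptions, not the paper's actual reward.

```python
import numpy as np

def pointing_reward(angle_error_rad, angular_velocity_rad_s,
                    w_err=1.0, w_vel=0.1):
    """Hypothetical shaped reward for attitude control: larger pointing
    error and residual body rates both reduce the reward, so the optimum
    is reaching the target attitude and staying still there."""
    return -(w_err * abs(angle_error_rad)
             + w_vel * float(np.linalg.norm(angular_velocity_rad_s)))
```

Adjusting w_err and w_vel is the knob the abstract alludes to: the designer trades convergence speed against pointing stability without changing the learning algorithm.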
Deep Reinforcement Learning with Weighted Q-Learning
Overestimation of the maximum action-value is a well-known problem that
hinders Q-Learning performance, leading to suboptimal policies and unstable
learning. Among several Q-Learning variants proposed to address this issue,
Weighted Q-Learning (WQL) effectively reduces the bias and shows remarkable
results in stochastic environments. WQL uses a weighted sum of the estimated
action-values, where the weights correspond to the probability of each
action-value being the maximum; however, the computation of these probabilities
is only practical in the tabular settings. In this work, we provide the
methodological advances to benefit from the WQL properties in Deep
Reinforcement Learning (DRL), by using neural networks with Dropout Variational
Inference as an effective approximation of deep Gaussian processes. In
particular, we adopt the Concrete Dropout variant to obtain calibrated
estimates of epistemic uncertainty in DRL. We show that model uncertainty in
DRL can be useful not only for action selection, but also action evaluation. We
analyze how the novel Weighted Deep Q-Learning algorithm reduces the bias
w.r.t. relevant baselines and provide empirical evidence of its advantages on
several representative benchmarks.
Comment: Corrected typo
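The weighted estimator at the heart of WQL can be sketched directly: given posterior draws of the action-values (in the paper, forward passes of a dropout network), each action's mean estimate is weighted by the Monte Carlo probability that it is the argmax. The Gaussian draws below stand in for dropout samples.

```python
import numpy as np

rng = np.random.default_rng(1)

def weighted_max_estimate(q_samples):
    """Weighted estimator of max_a E[Q(a)]: weight each action's mean
    action-value by the empirical probability that it is the argmax.
    q_samples: (n_samples, n_actions) posterior draws of the action-values."""
    n_samples, n_actions = q_samples.shape
    means = q_samples.mean(axis=0)
    argmax_counts = np.bincount(q_samples.argmax(axis=1),
                                minlength=n_actions)
    weights = argmax_counts / n_samples   # P(a is the maximizer)
    return float(weights @ means)

# All true values are 0: taking the max of noisy mean estimates is biased
# upward, while the weighted estimate averages over argmax uncertainty.
q_samples = rng.normal(loc=0.0, scale=1.0, size=(64, 3))
plain_max = q_samples.mean(axis=0).max()
weighted = weighted_max_estimate(q_samples)
```

Because the weighted estimate is a convex combination of the per-action means, it can never exceed the plain max, which is the bias-reduction mechanism the abstract refers to.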
Graph Deep Learning for Time Series Forecasting
Graph-based deep learning methods have become popular tools to process
collections of correlated time series. Differently from traditional
multivariate forecasting methods, neural graph-based predictors take advantage
of pairwise relationships by conditioning forecasts on a (possibly dynamic)
graph spanning the time series collection. The conditioning can take the form
of an architectural inductive bias on the neural forecasting architecture,
resulting in a family of deep learning models called spatiotemporal graph
neural networks. Such relational inductive biases enable the training of global
forecasting models on large time-series collections, while at the same time
localizing predictions w.r.t. each element in the set (i.e., graph nodes) by
accounting for local correlations among them (i.e., graph edges). Indeed,
recent theoretical and practical advances in graph neural networks and deep
learning for time series forecasting make the adoption of such processing
frameworks appealing and timely. However, most of the studies in the literature
focus on proposing variations of existing neural architectures by taking
advantage of modern deep learning practices, while foundational and
methodological aspects have not been subject to systematic investigation. To
fill the gap, this paper aims to introduce a comprehensive methodological
framework that formalizes the forecasting problem and provides design
principles for graph-based predictive models and methods to assess their
performance. At the same time, together with an overview of the field, we
provide design guidelines, recommendations, and best practices, as well as an
in-depth discussion of open challenges and future research directions.
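The conditioning mechanism described above, forecasts conditioned on a graph spanning the time series collection, can be sketched as one recurrent step in which a message-passing term enters the state update. The shapes, weights, and single-layer design below are illustrative assumptions, not a specific model from the survey.

```python
import numpy as np

rng = np.random.default_rng(4)

n_nodes, hidden = 3, 8
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
# Row-normalized adjacency: each node averages its neighbors' messages.
a_norm = adj / np.maximum(adj.sum(axis=1, keepdims=True), 1.0)

Wx, Wh, Wa = (rng.normal(size=(hidden, hidden)) * 0.1 for _ in range(3))

def stgnn_step(x_t, h_prev):
    """One recurrent spatiotemporal step (untrained, illustrative): the
    graph term a_norm @ h_prev conditions each node's update on its
    neighbors, localizing an otherwise global model."""
    msg = a_norm @ h_prev                 # spatial propagation along edges
    return np.tanh(x_t @ Wx + h_prev @ Wh + msg @ Wa)

h = np.zeros((n_nodes, hidden))
for x_t in rng.normal(size=(5, n_nodes, hidden)):  # 5 time steps of features
    h = stgnn_step(x_t, h)
```

All nodes share Wx, Wh, and Wa (a global model), while the adjacency term differentiates their states (local predictions), which is exactly the global/local trade-off the framework formalizes.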
Taming Local Effects in Graph-based Spatiotemporal Forecasting
Spatiotemporal graph neural networks have been shown to be effective in time
series forecasting applications, achieving better performance than standard
univariate predictors in several settings. These architectures take advantage
of a graph structure and relational inductive biases to learn a single (global)
inductive model to predict any number of the input time series, each associated
with a graph node. Despite the gain achieved in computational and data
efficiency w.r.t. fitting a set of local models, relying on a single global
model can be a limitation whenever some of the time series are generated by a
different spatiotemporal stochastic process. The main objective of this paper
is to understand the interplay between globality and locality in graph-based
spatiotemporal forecasting, while contextually proposing a methodological
framework to rationalize the practice of including trainable node embeddings in
such architectures. We ascribe to trainable node embeddings the role of
amortizing the learning of specialized components. Moreover, embeddings allow
for 1) effectively combining the advantages of shared message-passing layers
with node-specific parameters and 2) efficiently transferring the learned model
to new node sets. Supported by strong empirical evidence, we provide insights
and guidelines for specializing graph-based models to the dynamics of each time
series and show how this aspect plays a crucial role in obtaining accurate
predictions.
Comment: Accepted at NeurIPS 202
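The role of trainable node embeddings described above can be sketched with a hypothetical linear predictor: a single set of shared weights processes every node's input window, concatenated with a node-specific embedding that amortizes local effects. Everything below (linear model, shapes, random initialization) is an illustrative assumption, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(2)

n_nodes, window, emb_dim = 5, 12, 4

# Shared (global) weights, used for every node...
W = rng.normal(size=(window + emb_dim, 1)) * 0.1
# ...plus one trainable embedding per node to specialize predictions.
node_emb = rng.normal(size=(n_nodes, emb_dim)) * 0.1

def forecast(x_window, node_id):
    """One-step-ahead forecast: shared weights W act on the input window
    concatenated with the node-specific embedding."""
    inp = np.concatenate([x_window, node_emb[node_id]])
    return float(inp @ W)

x = rng.normal(size=window)
preds = [forecast(x, i) for i in range(n_nodes)]
```

Given the same input window, predictions differ across nodes only through their embeddings; transferring to new nodes then amounts to fitting new embedding rows while keeping W fixed.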